Exoplanet validation with machine learning : 50 new validated Kepler planets
Over 30% of the ∼4000 known exoplanets to date have been discovered using ‘validation’, where the statistical likelihood of a transit arising from a false-positive (FP), non-planetary scenario is calculated. For the large majority of these validated planets, calculations were performed using the vespa algorithm (Morton et al. 2016). Regardless of the strengths and weaknesses of vespa, it is highly desirable for the catalogue of known planets not to be dependent on a single method. We demonstrate the use of machine learning algorithms, specifically a Gaussian process classifier (GPC) reinforced by other models, to perform probabilistic planet validation incorporating prior probabilities for possible FP scenarios. The GPC can attain a mean log-loss per sample of 0.54 when separating confirmed planets from FPs in the Kepler threshold crossing event (TCE) catalogue. Our models can validate thousands of unseen candidates in seconds once applicable vetting metrics are calculated, and can be adapted to work with the active TESS mission, where the large number of observed targets necessitates the use of automated algorithms. We discuss the limitations and caveats of this methodology, and after accounting for possible failure modes newly validate 50 Kepler candidates as planets, sanity-checking the validations by confirming them with vespa using up-to-date stellar information. Concerning discrepancies with vespa arise for many other candidates, which typically resolve in favour of our models. Given such issues, we caution against using single-method planet validation with either method until the discrepancies are fully understood.
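The "mean log-loss per sample" quoted in the abstract is the standard binary cross-entropy averaged over samples. The following is a minimal sketch of that metric only, not the authors' pipeline; the function name `mean_log_loss`, the clipping constant `eps`, and the example labels and probabilities are all illustrative assumptions.

```python
import math

def mean_log_loss(labels, planet_probs, eps=1e-15):
    """Mean binary log-loss (cross-entropy) per sample.

    labels: 1 for a confirmed planet, 0 for a false positive.
    planet_probs: predicted probability that each sample is a planet.
    """
    if len(labels) != len(planet_probs) or len(labels) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    total = 0.0
    for y, p in zip(labels, planet_probs):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never receives 0
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

# Made-up labels and probabilities for illustration (not data from the paper).
labels = [1, 1, 0, 0]
confident = [0.9, 0.8, 0.2, 0.1]
uninformative = [0.5, 0.5, 0.5, 0.5]

print(mean_log_loss(labels, confident))
print(mean_log_loss(labels, uninformative))  # ln 2, the score of always predicting 0.5
```

Lower values are better. A classifier that always outputs 0.5 scores ln 2 (about 0.693), which gives a simple reference point when reading log-loss figures.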
MILD-Net: Minimal Information Loss Dilated Network for Gland Instance Segmentation in Colon Histology Images
The analysis of glandular morphology within colon histopathology images is an
important step in determining the grade of colon cancer. Despite the importance
of this task, manual segmentation is laborious, time-consuming and can suffer
from subjectivity among pathologists. The rise of computational pathology has
led to the development of automated methods for gland segmentation that aim to
overcome the challenges of manual segmentation. However, this task is
non-trivial due to the large variability in glandular appearance and the
difficulty in differentiating between certain glandular and non-glandular
histological structures. Furthermore, a measure of uncertainty is essential for
diagnostic decision making. To address these challenges, we propose a fully
convolutional neural network that counters the loss of information caused by
max-pooling by re-introducing the original image at multiple points within the
network. We also use atrous spatial pyramid pooling with varying dilation rates
for preserving the resolution and multi-level aggregation. To incorporate
uncertainty, we introduce random transformations during test time for an
enhanced segmentation result that simultaneously generates an uncertainty map,
highlighting areas of ambiguity. We show that this map can be used to define a
metric for disregarding predictions with high uncertainty. The proposed network
achieves state-of-the-art performance on the GlaS challenge dataset and on a
second independent colorectal adenocarcinoma dataset. In addition, we perform
gland instance segmentation on whole-slide images from two further datasets to
highlight the generalisability of our method. As an extension, we introduce
MILD-Net+ for simultaneous gland and lumen segmentation, to increase the
diagnostic power of the network.
Comment: Initial version published at Medical Imaging with Deep Learning (MIDL) 201
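The test-time uncertainty idea described in this abstract can be sketched as follows: apply random invertible transformations to the input, map each prediction back to the original pixel grid, and use the per-pixel mean as the segmentation and the per-pixel variance as the uncertainty map. This is a rough illustration under stated assumptions, not the MILD-Net implementation: `toy_model` is a hypothetical stand-in for a trained network, the transformation set (flips only) is chosen for simplicity, and `foreground_uncertainty` is a hypothetical rejection score rather than the metric defined in the paper.

```python
import random
from statistics import mean, pvariance

def identity(img):
    return [row[:] for row in img]

def hflip(img):
    """Mirror left-right; applying it twice restores the input."""
    return [row[::-1] for row in img]

def vflip(img):
    """Mirror top-bottom; applying it twice restores the input."""
    return [row[:] for row in img[::-1]]

# (forward, inverse) pairs; each flip is its own inverse.
TRANSFORMS = [(identity, identity), (hflip, hflip), (vflip, vflip)]

def toy_model(img):
    """Hypothetical stand-in for a trained segmentation network.

    Averages each pixel with its left neighbour, so the per-pixel
    foreground probability depends on image orientation.
    """
    return [[0.5 * (row[j] + row[max(j - 1, 0)]) for j in range(len(row))]
            for row in img]

def tta_predict(img, model, n_passes=8, seed=0):
    """Return (mean_map, variance_map) over randomly transformed passes."""
    rng = random.Random(seed)
    preds = []
    for _ in range(n_passes):
        forward, inverse = rng.choice(TRANSFORMS)
        # Undo the transform so every prediction lies on the original grid.
        preds.append(inverse(model(forward(img))))
    h, w = len(img), len(img[0])
    mean_map = [[mean([p[i][j] for p in preds]) for j in range(w)]
                for i in range(h)]
    var_map = [[pvariance([p[i][j] for p in preds]) for j in range(w)]
               for i in range(h)]
    return mean_map, var_map

def foreground_uncertainty(mean_map, var_map, threshold=0.5):
    """Hypothetical score: mean variance over pixels predicted as foreground."""
    values = [v for m_row, v_row in zip(mean_map, var_map)
              for m, v in zip(m_row, v_row) if m >= threshold]
    return mean(values) if values else 0.0

# Tiny synthetic image: a 2x3 block of foreground on a background of zeros.
image = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
mean_map, var_map = tta_predict(image, toy_model)
print(foreground_uncertainty(mean_map, var_map))
```

A prediction whose score exceeds a chosen cut-off could then be disregarded; the cut-off itself would need to be tuned on held-out data.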
Newcomb-Benford Law as a generic flag for changes in the derivation of long-term solar terrestrial physics timeseries
The Newcomb-Benford Law (NBL) prescribes the probability distribution of the first digit of variables which explore a broad range under conditions including aggregation. Long-term space-weather-relevant observations and indices necessarily incorporate changes in the number and types of contributing observing instruments over time, and we find that such changes can be detected solely by comparison with the NBL. The comparison detects when the OMNI High Resolution (HRO) record of the upstream solar wind Interplanetary Magnetic Field incorporated new data from the WIND and Advanced Composition Explorer (ACE) spacecraft after 1995. NBL comparison can also detect underlying changes in the geomagnetic Auroral Electrojet (AE) index (activity-dependent background subtraction) and the SuperMAG Electrojet (SME) index (different station types), both of which select the individual stations showing the largest deflection, but not in indices where station data are averaged, as in the SuperMAG Ring Current (SMR) index. As composite indices become more widespread across the geosciences, the NBL may provide a generic, data-processing-independent flag indicating changes in the constituent raw data, calibration or sampling method.
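The NBL assigns first digit d (1 to 9) the probability log10(1 + 1/d). A minimal sketch of an NBL comparison follows, assuming the maximum absolute difference between observed and expected first-digit frequencies as the distance measure; that choice, the function names, and the synthetic test series are illustrative assumptions, not the statistic or data used in the paper.

```python
import math
from collections import Counter

def benford_pmf():
    """Newcomb-Benford probabilities P(d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1.0 + 1.0 / d) for d in range(1, 10)}

def first_digit(x):
    """Leading non-zero decimal digit of x, or None for zero or non-finite input."""
    x = abs(float(x))
    if x == 0.0 or not math.isfinite(x):
        return None
    # The first character in '1'..'9' of repr() is the leading significant digit.
    return int(next(ch for ch in repr(x) if ch in "123456789"))

def first_digit_frequencies(values):
    """Observed relative frequency of each first digit 1..9."""
    digits = [d for d in map(first_digit, values) if d is not None]
    if not digits:
        raise ValueError("no usable (non-zero, finite) values")
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}

def benford_deviation(values):
    """Maximum absolute gap between observed and NBL first-digit frequencies."""
    observed = first_digit_frequencies(values)
    expected = benford_pmf()
    return max(abs(observed[d] - expected[d]) for d in range(1, 10))

# Synthetic series for illustration only.
broad = [2.0 ** k for k in range(1000)]  # spans many orders of magnitude
narrow = list(range(100, 1000))          # first digits are uniformly distributed

print(benford_deviation(broad))
print(benford_deviation(narrow))
```

Applied to successive time windows of an index, a jump in such a deviation would flag a candidate change in the underlying data, which is the kind of signal the abstract describes.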